\section{Lower bound for the strongly adaptive adversary model}
\label{sec:lower} 

In this section, we prove Theorem \ref{thm:alg+lower}. 
We first define the adversary used in the proof of Theorem
\ref{thm:alg+lower}. 

%\tAlgLower

\smallskip
\noindent {\em Adversary:} The strategy of the adversary is simple.
We use the notion of {\em free edge}\/ introduced
in~\cite{kuhn+lo:dynamic}.  In a given round $r$, we call an edge
$(u,v)$ \emph{free} \junk{
\footnote{The notion of a free edge is borrowed from
  \cite{kuhn+lo:dynamic}.  Apart from this notion, our proof technique
  is different from that of \cite{kuhn+lo:dynamic} where only a an
  $\Omega(n\log n)$ lower bound is shown.}} if at the start of the
round, $u$ has the token that $v$ broadcasts in the round and $v$ has
the token that $u$ broadcasts in the round; an edge that is not free
is called {\em non-free}.
%(For convenience, when a node does not
%broadcast any token we will view it as broadcasting a
%special {\em empty}\/ token that every node has.)
Thus, if $(u,v)$ is a free edge in a particular round, neither $u$ nor
$v$ can gain any new token through this edge in the round. Since we
are considering a strong adversary model, at the start of each round,
the adversary knows for each node $v$, the token
%(if any)
that $v$ will broadcast in that round.  In round $r$, the adversary
constructs the communication graph $G_r$ as follows. First, the
adversary adds all the free edges to $G_r$. Let $C_1,C_2,\dots,C_l$
denote the connected components thus formed. The adversary then
guarantees the connectivity of the graph by selecting an arbitrary
node in each connected component and connecting them in a line. 
Figure\ref{fig:adversary} illustrates the construction.

The network $G_r$ thus constructed has exactly $l - 1$ non-free edges,
where $l$ is the number of connected components formed by the free
edges of $G_r$.  If $(u,v)$ is a non-free edge in $G_r$, then $u$, $v$
will gain at most one new token each through $(u,v)$. We refer to this
exchange on a non-free edge as a {\em useful token exchange}.

\smallskip
%\noindent {\em Overview of the proof.} 
Our proof proceeds as follows. First, we show that with high
probability over the initial assignment of tokens, in every round
there are at most $O(\lg n)$ useful token exchanges. Then we note
that, again with high probability over the initial assignment of
tokens, overall $\Omega(nk)$ useful token exchanges must occur for the
protocol to complete.

\smallskip

%We bound the running-time of any token-forwarding algorithm by
%identifying a critical structure that quantifies the progress made in
%each round.

\begin{definition}
We say that a sequence of nodes $v_1,v_2,\ldots,v_k$ is {\em
  half-empty}\/ in round $r$ with respect to a sequence of tokens
$t_1,t_2,\ldots,t_k$ if the following condition holds at the start of
round $r$: for all $1 \le i,j \le k$, $i \neq j$, either $v_i$ is
missing $t_j$ or $v_j$ is missing $t_i$.  We then say that $\langle
v_i \rangle$ is half-empty with respect to $\langle t_i \rangle$ and
refer to the pair $(\langle v_i \rangle, \langle t_i \rangle)$ as a
half-empty configuration of size $k$.
\end{definition}

\vspace{-5pt}
\begin{figure*}[ht]
\begin{center}
\includegraphics[width=4in]{adversary.jpg}
\caption{\small{The network constructed by the adversary in a particular
  round.  Note that if node $v_i$ broadcasts token $t_i$, then the
  $\langle v_i \rangle$ forms a half-empty configuration with respect
  to $\langle t_i \rangle$ at the start of this round.}}
\label{fig:adversary}
\end{center}
\vspace{-1cm}
\end{figure*}

\junk{To prove the lower bound, we consider a special starting state, where
each node $u$ has token $t_i$ with probability $3/4$ for all $i$. Then
we look for certain structure in this starting state, and argue that
with such structure the number of useful token exchanges is $O(\log
n)$ with high probability. Furthermore, we argue that such structure
will always exist in the following rounds of communication. That
means, the number of useful token exchanges will be $O(\log n)$ in
every round. Since each node $u$ has any token with probability $3/4$
at the starting point, the expected number of missing token is $n\cdot
k/4 = \Omega(kn)$. In fact, we argue the number of missing token is
$\Omega(kn)$ with high probability using Chernoff bound. Thus, this
will imply any centralized deterministic algorithm runs in
$\Omega(kn/\log n)$ rounds.

First, we draw the connection between the number of useful token
exchanges and the existence of certain structure in Lemma
\ref{lem:free+edge}. Then in Lemma \ref{lem:alg+lower} we show the
$\Omega(n + nk/\log n)$ lower bound if the token dissemination process
starts from the state where each node $u$ has token $t_i$ with
probability $3/4$ for all $i$. Lastly, in Theorem
\ref{thm:alg+lower.single} we prove our lower bound.
\begin{lemma}
\label{lem:free+edge}
If $m$ or more useful token exchanges occur, then there exist $m/2+1$
nodes $v_1,v_2,\dots,v_{m/2+1}$ and $m/2+1$ tokens
$t_1,t_2,\dots,t_{m/2+1}$ ($v_i$ broadcasts token $t_i$) such that,
for all $i<j$, either $v_i$ is missing $t_j$ or $v_j$ is missing
$t_i$.
\end{lemma}
}

\begin{lemma}
\label{lem:free+edge}
If $m$ useful token exchanges occur in round $r$, then there exists a
half-empty configuration of size at least $m/2 + 1$ at the start of
round $r$.
\end{lemma}
\begin{proof}
Consider the network $G_r$ in round $r$.  Each non-free edge can
contribute at most 2 useful token exchanges.  Thus, there are at least
$m/2$ non-free edges.  Based on the adversary we consider, no useful
token exchange takes place within the connected components induced by
the free edges. Useful token exchanges can only happen over the
non-free edges between connected components. This implies there are at
least $m/2+1$ connected components in the subgraph of $G_r$ induced by
the free edges.  Let $v_i$ denote an arbitrary node in the $i$th
connected component in this subgraph, and let $t_i$ be the token
broadcast by $v_i$ in round $r$.  For $i \neq j$, since $v_i$ and
$v_j$ are in different connected components, $(v_i,v_j)$ is a non-free
edge in round $r$; hence, at the start of round $r$, either $v_i$ is
missing $t_j$ or $v_j$ is missing $t_i$.  Thus, the sequence $\langle
v_i \rangle$ of nodes of size at least $m/2 + 1$ is half-empty with
respect to the sequence $\langle t_i \rangle$ at the start of round
$r$.
\end{proof}

An important point to note about the definition of a half-empty
configuration is that, in a given round, it only depends on the tokens
held by the nodes; it is independent of the tokens that the nodes
broadcast.  This allows us to prove the following easy lemma that
shows a monotonicity property of half-empty configurations.
\begin{lemma}[Monotonicity Property]
\label{lem:monotone}
If a sequence $\langle v_i \rangle$ of nodes is half-empty with
respect to $\langle t_i \rangle$ at the start of round $r$, then
$\langle v_i \rangle$ is half-empty with respect to $\langle t_i
\rangle$ at the start of round $r'$ for any $r' \le r$.  Hence, the
size of the largest half-empty configuration cannot increase with the
increase in the number of rounds.
\end{lemma}
\begin{proof}
The lemma follows by noting that if a node $v_i$ is missing a token
$t_j$ at the start of round $r$, then $v_i$ is missing token $t_j$ at
the start of every round $r' < r$.
\end{proof}

Lemmas~\ref{lem:free+edge} and~\ref{lem:monotone} suggest that if we
can identify a token distribution in which all half-empty
configurations are small, we can guarantee small progress in each
round.  We now show that a well-mixed distribution satisfies the
desired property, establishing part (b) of the theorem. \junk{Note
  that the "high probability" bound in the following theorem is with
  respect to the random choices made in the initial token placement,
  and thus shows, by the probabilistic method, the lower bound holds
  for most of the initial token distributions.}

\begin{proof}[Proof of Theorem \ref{thm:alg+lower}(b)]
We first note that if the number of tokens $k$ is less
than $100 \log n$, then the $\Omega(n + nk/\log n)$ lower
bound is trivially true because even to disseminate one
token on a line it takes $\Omega(n)$ rounds\footnote{ The choice of
  the constant 100 here is arbitrary; we have not optimized the choice
  of constants in the proof.}. Thus, in the
following proof, we focus on the case where $k \ge 100
\log n$.

Let $E_l$ denote the event that there exists a half-empty
configuration of size $l$ at the start of the first round.  For $E_l$
to hold, we need $l$ nodes $v_1, v_2, \dots, v_l$ and $l$ tokens
$t_1, t_2, \dots, t_l$ such that for all $i \neq j$ either $v_i$ is
missing $t_j$ or $v_j$ is missing $t_i$. For a pair of nodes $u$ and
$v$, by union bound, the probability that $u$ is missing $t_v$ or $v$
is missing $t_u$ is at most $1/4+1/4 = 1/2$. Thus, the probability of
$E_l$ can be bounded as follows.
\[ \prob{E_l} \le {n \choose l} \cdot \frac{k!}{(k-l)!} \cdot \rb{\frac{1}{2}}^{l \choose 2} \le  n^l \cdot k^l \frac{1}{2^{l(l-1)/2}} \le \frac{2^{2l\log n}}{2^{l(l-1)/2}}.\]
In the above inequality, ${n \choose l}$ is the number of ways of
choosing the $l$ nodes that form the half-empty configuration,
$k!/(k-l)!$ is the number of ways of assigning $l$ distinct tokens,
and $(1/2)^{{l \choose 2}}$ is the upper bound on the probability for
each pair $i \neq j$ that either $v_i$ is missing $t_j$ or $v_j$ is
missing $t_i$.  For $l \geq 5 \log n$, $\prob{E_l} \le 1/n^2$.  Thus,
the largest half-empty configuration at the start of the first round,
and hence at the start of {\em any} round (by Lemma
\ref{lem:monotone}), is of size at most $5 \log n$ with probability at
least $1 - 1/n^2$.  By Lemma~\ref{lem:free+edge}, we thus obtain that
the number of useful token exchanges in each round is at most $10 \log
n$, with probability at least $1 - 1/n^2$.

\junk{Now we argue, in each following rounds, the number of useful token
exchanges is also no more than $2(l-1)$ with high probability, where $l\ge
5\log n$. If there are $2(l-1)$ or more useful token exchanges in
round $r$ where $r>1$, then by Lemma \ref{lem:free+edge} there exist
$l$ nodes $v_1,v_2,\dots,v_l$ and $l$ tokens $t_1,t_2,\dots,t_l$ such
that for all $i \neq j$ either $v_i$ is missing $t_j$ or $v_j$ is
missing $t_i$. Then at the beginning of round 1, the condition that
for all $i \neq j$ either $v_i$ is missing $t_j$ or $v_j$ is missing
$t_i$ still holds, and this can happen only with probability $1/n^2$.}

Let $M_i$ be the number of tokens missing at node $i$ in the initial
distribution. Then $M_i$ is a binomial random variable with
$\expect{M_i} = k/4$.  By a Chernoff bound, the probability that node
$i$ misses at most $k/8$ tokens is
\[ \prob{M_i \le \frac{k}{8}} = \prob{M_i \le \rb{1 - \frac{1}{2}} \cdot \expect{M_i}} \le e^{-\frac{\expect{M_i}\rb{\frac{1}{2}}^2}{2}} = e^{-\frac{k}{32}}.\]
Thus, the total number of tokens missing in the initial
distribution is at least $n \cdot k/8 = \Omega(kn)$ with
probability at least $1 - n/e^{\frac{k}{32}} \ge 1 -
1/n^2$ ($k \ge 100 \log n$).  Since the number of useful
tokens exchanged in each round is at most $10 \log n$,
the number of rounds needed to complete $k$-gossip is
$\Omega(kn /\log n)$ with high probability. 
\end{proof}

Part (b) of Theorem~\ref{thm:alg+lower} does not apply to some natural
initial distributions, such as one in which each token resides at
exactly one node.  When starting from a distribution in this class,
though there are far fewer tokens distributed initially, the argument
above does not rule out the possibility that an algorithm avoids the
problematic configurations that arise in the proof.  Part (a) of
Theorem~\ref{thm:alg+lower} extends the lower bound to this class of
distributions. The main idea of the proof is showing that a reduction
exists (via the probabilistic method) to an initial well-mixed
distribution of Theorem~\ref{thm:alg+lower}.

\junk{\begin{theorem}
\label{thm:lower.single}
Starting from any distribution in which each token starts at exactly one node,
any online token-forwarding algorithm for $k$-gossip needs $\Omega(n +
nk/\log n)$ rounds against a strong adversary.
\end{theorem}
}

\begin{lemma}
\label{lem:alg+lower.single}
From any distribution in which each token starts at exactly one node
and no node has more than one token, any online token-forwarding
algorithm for $k$-gossip needs $\Omega(kn/\log n)$ rounds against a
strong adversary.
\end{lemma}
\begin{proof}
We consider an initial distribution $C$ where each token is at exactly
one node, and no node has more than one token. Let $C^*$ be an initial
token distribution in which each node has each token independently
with probability $3/4$.  By Theorem~\ref{thm:alg+lower}, any online
algorithm starting from distribution $C^*$ needs $\Omega(kn/\log n)$
rounds with high probability.

We construct a bipartite graph on two copies of $V$, $V_1$ and
$V_2$. A node $v \in V_1$ is connected to a node $u \in V_2$ if in
$C^*$ $u$ has all the tokens that $v$ has in $C$.  We first show,
using Hall's Theorem, that this bipartite graph has a perfect matching
with very high probability.  Consider a set of $m$ nodes in $V_2$. We
want to show their neighborhood in the bipartite graph is of size at
least $m$. We show this condition holds by the following 2 cases. If
$m < 3n/5$, let $X_i$ denote the neighborhood size of node $i$. We
know $\expect{X_i} \ge 3n/4$. Then by Chernoff bound
\[ \prob{X_i < m} \le \prob{X_i < 3n/5} \le e^{-\frac{\rb{1/5}^2 \expect{X_i}}{2}} = e^{-\frac{3n}{200}}.\]
By union bound with probability at least $1-n\cdot e^{-3n/200}$ the
neighborhood size of every node is at least $m$. Therefore, the
condition holds in the first case. If $m \ge 3n/5$, we argue that the
neighborhood size of any set of $m$ nodes from $V_2$ is $V_1$ with high
probability. Consider a set of $m$ nodes, the probability that a given
token $t$ is missing in all these $m$ nodes is $(1/4)^m$. Thus the
probability that any token is missing in all these nodes is at most
$n(1/4)^m \le n(1/4)^{3n/5}$. There are at most $2^n$ such sets. By
union bound, with probability at least $1-2^n\cdot n(1/4)^{3n/5} =
1-n/2^{n/5}$, the condition holds in the second case.

By applying the union bound, we obtain that with positive probability
(in fact, high probability), $C^*$ takes $\Omega(nk/\log n)$ rounds
and there is a perfect matching $M$ in the above bipartite graph.  By
the probabilistic method, thus both $C^*$ and $M$ exist.  Given such
$C^*$ and $M$, we complete the proof as follows.  For $v\in V_2$, let
$M(v)$ denote the node in $V_1$ that got matched to $v$.  If there is
an algorithm $A$ that runs in $T$ rounds from starting state $C$, then
we can construct an algorithm $A^*$ that runs in the same number of
rounds from starting state $C^*$ as follows. First every node $v$
deletes all its tokens except for those which $M(v)$ has in $C$. Then
algorithm $A^*$ runs exactly as $A$.  Thus, the lower bound of
Theorem~\ref{thm:alg+lower}, which applies to $A^*$ and $C^*$, also
applies to $A$ and $C$. 
\end{proof}

\begin{proof}[Proof of Theorem~\ref{thm:alg+lower}(a)]
We extend our proof in Lemma~\ref{lem:alg+lower.single} to the inital
distibution $C$ where each token starts at exactly one node, but nodes
may have multiple tokens.  We consider the following two cases.

The first case is when at least $n/2$ nodes start with some
token. This implies that $k\ge n/2$.  Let us focus on the $n/2$ nodes
with tokens. Each of them has at least one unique token. By the same
argument used in Lemma~\ref{lem:alg+lower.single}, disseminating these
$n/2$ distinct tokens to $n$ nodes takes $\Omega(n^2/\log n)$
rounds. Thus, in this case the number of rounds needed is
$\Omega(kn/\log n)$.

The second case is when less than $n/2$ nodes start with some
token. In this case, the adversary can group these nodes together, and
treat them as one super node. There is only one edge connecting this
super node to the rest of the nodes. Thus, the number of useful token
exchanges provided by this super node is at most one in each round. If
there exsits an algorithm that can disseminate $k$ tokens in
$o(kn/\log n)$ rounds, then the contribution by the super node is
$o(kn/\log n)$. And by the same argument used in
Lemma~\ref{lem:alg+lower.single} we know dissemination of $k$ tokens
to $n/2$ nodes (those start with no tokens) takes $\Omega(kn/\log n)$
rounds. Thus, the theorem also holds in this case. 
\end{proof}
